High-dimensional data arises in numerous applications, and the rapidly developing field of geometric deep learning seeks to develop neural network architectures to analyze such data in non-Euclidean domains, such as graphs and manifolds. Recent work by Z. Wang, L. Ruiz, and A. Ribeiro has introduced a method for constructing manifold neural networks using the spectral decomposition of the Laplace-Beltrami operator. In that work, the authors also provide a numerical scheme for implementing such neural networks when the manifold is unknown and one only has access to finitely many sample points. The authors show that this scheme, which relies upon building a data-driven graph, converges to the continuum limit as the number of sample points tends to infinity. Here, we build upon this result by establishing a rate of convergence that depends on the intrinsic dimension of the manifold but is independent of the ambient dimension. We also discuss how the rate of convergence depends on the depth of the network and the number of filters used in each layer.
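To make the data-driven graph construction concrete, the following is a minimal sketch, not the scheme analyzed in the abstract above. It assumes a Gaussian kernel with a hand-picked `bandwidth`, an unnormalized graph Laplacian, and a heat-kernel-style filter response; the function names `graph_laplacian` and `spectral_filter` are hypothetical. Points are sampled from a circle embedded in three dimensions, so the intrinsic dimension is 1 while the ambient dimension is 3.

```python
import numpy as np

def graph_laplacian(points, bandwidth):
    """Unnormalized graph Laplacian L = D - W from a Gaussian kernel (assumed choice)."""
    diffs = points[:, None, :] - points[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=-1)
    W = np.exp(-sq_dists / (4.0 * bandwidth))
    np.fill_diagonal(W, 0.0)  # no self-loops
    return np.diag(W.sum(axis=1)) - W

def spectral_filter(L, signal, response):
    """Apply a filter defined by its response on the Laplacian eigenvalues."""
    eigvals, eigvecs = np.linalg.eigh(L)   # L is symmetric
    coeffs = eigvecs.T @ signal            # graph Fourier coefficients
    return eigvecs @ (response(eigvals) * coeffs)

# Sample points from a circle (a 1-D manifold) sitting in R^3.
rng = np.random.default_rng(0)
theta = rng.uniform(0.0, 2.0 * np.pi, size=200)
points = np.stack([np.cos(theta), np.sin(theta), np.zeros_like(theta)], axis=1)

L = graph_laplacian(points, bandwidth=0.05)
signal = np.cos(theta) + 0.3 * rng.normal(size=theta.size)

# Low-pass filter: damp components associated with large eigenvalues.
smoothed = spectral_filter(L, signal, lambda lam: np.exp(-0.1 * lam))
```

A manifold neural network layer in this spirit would combine several such filters with a pointwise nonlinearity; the kernel, bandwidth, and normalization used here are illustrative assumptions only.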
Nonnegative matrix factorization can be used to automatically detect topics within a corpus in an unsupervised fashion. The technique amounts to an approximation of a nonnegative matrix as the product of two nonnegative matrices of lower rank. In this paper, we show that this factorization can be combined with regression on a continuous response variable. In practice, the method performs better than regression done after topics are identified, and it retains interpretability.
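As an illustration of the underlying factorization only, here is a minimal sketch of plain nonnegative matrix factorization using the standard multiplicative updates for the Frobenius objective. It does not implement the joint factorization-and-regression method described above; the function name `nmf`, the rank, the iteration count, and the synthetic data are all assumptions made for the example.

```python
import numpy as np

def nmf(X, rank, n_iter=200, seed=0, eps=1e-10):
    """Approximate a nonnegative X (m x n) as W @ H with W, H >= 0."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, rank))
    H = rng.random((rank, n))
    for _ in range(n_iter):
        # Multiplicative updates keep every entry nonnegative.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Synthetic "document-term" matrix with two latent topics (illustrative data).
rng = np.random.default_rng(1)
X = rng.random((30, 2)) @ rng.random((2, 20))

W_start, H_start = nmf(X, rank=2, n_iter=1)
W, H = nmf(X, rank=2, n_iter=200)

err_start = np.linalg.norm(X - W_start @ H_start)
err_final = np.linalg.norm(X - W @ H)
```

In a topic-modeling reading, the rows of `H` play the role of topics over the vocabulary and the rows of `W` give each document's topic weights.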
Media bias can significantly impact the formation and development of opinions and sentiments in a population. It is thus important to study the emergence and development of partisan media and political polarization. However, it is challenging to quantitatively infer the ideological positions of media outlets. In this paper, we present a quantitative framework to infer both political bias and content quality of media outlets from text, and we illustrate this framework with empirical experiments on real-world data. We apply a bidirectional long short-term memory (LSTM) neural network to a data set of more than 1 million tweets to generate a two-dimensional ideological-bias and content-quality measurement for each tweet. We then infer a ``media-bias chart'' of (bias, quality) coordinates for the media outlets by integrating the (bias, quality) measurements of the tweets of the media outlets. We also apply a variety of baseline machine-learning methods, such as a naive-Bayes method and a support-vector machine (SVM), to infer the bias and quality values for each tweet. All of these baseline approaches are based on a bag-of-words approach. We find that the LSTM-network approach has the best performance of the examined methods. Our results illustrate the importance of incorporating word order into machine-learning methods for text analysis.
We present sketched linear discriminant analysis, an iterative randomized approach to binary-class Gaussian model linear discriminant analysis (LDA) for very large data. We harness a least squares formulation and mobilize the stochastic gradient descent framework. As a result, we obtain a randomized classifier with performance comparable to that of full data LDA while requiring access to only one row of the training data at a time. We present convergence guarantees for the sketched predictions on new data within a fixed number of iterations. These guarantees account for both the Gaussian modeling assumptions on the data and algorithmic randomness from the sketching procedure. Finally, we demonstrate performance with varying step-sizes and numbers of iterations. Our numerical experiments demonstrate that sketched LDA can offer a viable alternative to full data LDA when the data may be too large for full data analysis.
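The idea of pairing a least squares formulation with stochastic gradient descent can be sketched as follows. This is an illustration under stated assumptions rather than the algorithm or the guarantees from the abstract: labels are coded as +1 and -1, an intercept is appended to each row, the step size is constant, and rows are sampled uniformly. The names `sgd_least_squares_lda` and `predict` are hypothetical.

```python
import numpy as np

def sgd_least_squares_lda(X, y, step=0.01, n_iter=5000, seed=0):
    """SGD on 0.5 * (row @ w - y_i)^2, touching one training row per iteration."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d + 1)                      # last entry is the intercept
    for _ in range(n_iter):
        i = rng.integers(n)                  # sample one row uniformly
        row = np.append(X[i], 1.0)
        residual = row @ w - y[i]
        w -= step * residual * row           # gradient step for this row
    return w

def predict(w, X):
    """Classify by the sign of the fitted linear score."""
    scores = X @ w[:-1] + w[-1]
    return np.where(scores >= 0, 1.0, -1.0)

# Two Gaussian classes with a shared identity covariance (synthetic data).
rng = np.random.default_rng(1)
mu = np.array([2.0, 0.0, 0.0, 0.0, 0.0])
n_per_class = 1000
X = np.vstack([
    rng.normal(size=(n_per_class, 5)) + mu,
    rng.normal(size=(n_per_class, 5)) - mu,
])
y = np.concatenate([np.ones(n_per_class), -np.ones(n_per_class)])

w = sgd_least_squares_lda(X, y)
accuracy = np.mean(predict(w, X) == y)
```

Because each update reads a single row, the full data matrix never needs to be held in memory at once; in a streaming setting, `X[i]` would be replaced by whatever row arrives next.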
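Recently, the "SP" (stochastic Polyak step size) method has emerged as a competitive adaptive method for setting the step sizes of SGD. SP can be interpreted as a method specialized to interpolated models, since it solves the interpolation equations. SP solves these equations by using local linearizations of the model. We take a step further and develop a method for solving the interpolation equations that uses a local second-order approximation of the model. Our resulting method, SP2, uses Hessian-vector products to speed up the convergence of SP. Furthermore, and unusually among second-order methods, the design of SP2 in no way relies on positive definite Hessian matrices or on convexity of the objective function. We show that SP2 is very competitive on matrix completion, non-convex test problems, and logistic regression. We also provide a convergence theory for sums of quadratics.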
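For reference, the first-order SP step can be sketched as below. This covers only the basic stochastic Polyak update, not the second-order SP2 method, and it assumes an interpolating model so that each per-sample loss has minimum value zero. The per-sample loss 0.5 * (a_i . w - b_i)^2 on a consistent linear system and the function name `sp_least_squares` are choices made for the example.

```python
import numpy as np

def sp_least_squares(A, b, n_iter=2000, seed=0):
    """SGD with the stochastic Polyak step size, assuming each f_i has minimum 0."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    w = np.zeros(d)
    for _ in range(n_iter):
        i = rng.integers(n)
        residual = A[i] @ w - b[i]
        loss = 0.5 * residual ** 2           # f_i(w)
        grad = residual * A[i]               # gradient of f_i at w
        grad_sq = grad @ grad
        if grad_sq > 0.0:                    # skip samples that are already fit
            w -= (loss / grad_sq) * grad     # step size f_i(w) / ||grad f_i(w)||^2
    return w

# Consistent system: an exact solution exists, so interpolation holds.
rng = np.random.default_rng(1)
A = rng.normal(size=(50, 5))
w_true = rng.normal(size=5)
b = A @ w_true

w = sp_least_squares(A, b)
error = np.linalg.norm(w - w_true)
```

The step size is computed from the current loss and gradient alone, so no learning rate needs to be tuned; for this particular loss the update reduces to a relaxed projection onto the hyperplane defined by the sampled equation.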
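Online tensor factorization (OTF) is a fundamental tool for learning low-dimensional interpretable features from streaming multi-modal data. While various algorithmic and theoretical aspects of OTF have been investigated recently, a general convergence guarantee to stationary points of the objective function without any incoherence or sparsity assumptions is still lacking, even for the i.i.d. case. In this work, we introduce a novel algorithm that learns a CANDECOMP/PARAFAC (CP) basis from a given stream of tensor-valued data under general constraints, including nonnegativity constraints that induce interpretability of the learned CP basis. We prove that our algorithm converges almost surely to the set of stationary points of the objective function under the hypothesis that the sequence of data tensors is generated by an underlying Markov chain. Our setting covers the classical i.i.d. case as well as a wide range of application contexts, including data streams generated by independent or MCMC sampling. Our result closes a gap between OTF and online matrix factorization in global convergence analysis for CP decompositions. Experimentally, we show that our algorithm converges much faster than standard algorithms for nonnegative tensor factorization tasks on both synthetic and real-world data. In addition, we demonstrate the utility of our algorithm on a diverse set of examples from image, video, and time-series data, illustrating how one may learn qualitatively different CP dictionaries from the same tensor data by exploiting the tensor structure in multiple ways.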
